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Exegesis of the 1201 and 1202 Alarms Which Occurred 
During the Mission G Lunar Landing 


During the mission G lunar landing five LGC Executive overflow 
alarms occurred. They were with approximate times of occurrence, 

1202 at PDI + 316 seconds 
1202 at PDI + 356 seconds 

1201 at PDI + 552 seconds 

1202 at PDI + 578 seconds 
1202 at PDI + 594 seconds 

The first two alarms occurred in P63; the last three occurred in P64. A 
later section of this letter gives more detail on the other DSKY activity 
during landing. 

■ • These alarms arose because of a computer overload caused by a 
high frequency train of "counter increment requests" from the RR ECDUs 
Each "counter increment request" requires one LGC central processor 
memory cycle. Consequently, the other computational tasks and jobs of 
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the LGC are effectively slowed down. The slow-down was ‘s‘6 bad at times 
that some repetitively scheduled jobs (like SERVICER, the navigation and 
guidance job which is done every two seconds) were still running when the 
time arrived for their succeeding cycle of computations to begin. 

I' This phenomenon (requesting the EXECUTIVE program to start .* 
some particular job before the previous request of the same job has been 
completed) is a feature of the way the LGC guidance cycle is structured 
and the way sucessive repetitive jobs are scheduled. The execution of jobs 

I 

may be slowed down by RR ECDU memory cycle robbery but the clock which 

schedules jobs like SERVICER and LRHJOB (the job which reads LR range) 

is scarcely affected. The clock (TIMES, the WAITLIST clock) we use to 

schedule these jobs ineluctably counts down to the time for the next repetition 

of a job to begin whether the previous repetition is complete or not. This 

is not the only way to structure our programs but it is the simplest and most 

natural way. There is, of course, danger in this structure, a danger well 

understood and, we thought, provided against. Some computations may 

never be finished if the basic cycle period is to short for the load on the 

central processor. Worse yet, the EXECUTIVE program may be deluged 

with requests for jobs - the new ones and the old still unfinished ones. 

The EXECUTIVE must supply a unshared set of erasables for each job's 

execution from the moment it is started until the moment it is completed. 

The EXECUTIVE has only so many sets of these erasables at its disposal. 

After it has allocated all its sets the next requestor of a job asks for the 

impossible. The EXECUTIVE responds to the impossible request by 

turning on the PROG light (program alarm), storing the 1201 or 1202 alarm 

code, and transferring control to the BAILOUT routine which tries to bail 

the program out of trouble by stopping all jobs and tasks and restarting only 

the absolutely required (in the programmer's limited fallible viewpoint) jobs 

and tasks. Thus, certain astronaut requested activity like extended verbs 

% 

and monitor displays are preemptively terminated (he can call them back). 
This cleaning out of the temporarily dispensable can cure a temporary over- 
load. A permanent overload is the very dangerous situation. 


If there is a danger in this simple natural philosophy the danger can 
be reduced to an acceptable level. We try to do this by simulating all the 
known sources of coinputer load and then to be safe impose a safety margin. 
We simulate an unknown source or sources of memory cycle loss (we call 
it TLOSS) and insist that a certain TLOSS be tolerable (no 1201's, no 1202's). 
The margin has a price - it can be obtained by lengthening the navigation, 
guidance and display period and by trading off execution speed against 
storage in the coding technique. We insisted on a tolerable TLOSS of about 
10% during landing. The coding and the guidance period were therefore 
massaged until 10% TLOSS was tolerable (with a monitor verb running). 
Unfortunately, the RR ECDU's stole about 15% (most of the time). 

T he balance of this letter gives a simple explanation of the 
EXECUTIVE program and the job structure during landing, the interesting 
hardware interface between the RR shaft and .trunnion angle resolvers and 
the LGC when the RR switch is in AUTO TRACK or SLEW, and also what 
we are doing in LUMINARY IB about this interface. Finally, it is not this 
specific problem which we thoroughly understand and have several ways 
of avoiding that really concerns us now. As you asked me in your office 
last Thursday, how can we prevent a similar presently unknown thing from 
happening? The last part of this letter explains some precautions we are 
taking and some disciplines we are invoking.. 

The Executive Program and How 1201 and 1202 Alarms Can Arise 

The job scheduling and job supervising program in the LGC is called 
the EXECUTIVE. When the landing programs must schedule a job they call 
tbe EXECUTIVE program and give it the address of the job to be run and 
the priority for running it. Examples of jobs which arp scheduled repetitively 
during landing are SERVICER, LRHJOB, and LRVJOB (the job which reads 
the velocity measurement from the LR). An example of a one shot job 
scheduled at HI- GATE and therefore called HIGATJOB, is the LR antenna 
re -positioning program. 

People sometimes talk about jobs running "simultaneously”. This 
cannot literally happen since there is only one central processor in the LGC 
and only one job's computations can be done at a time. However, jobs can • 
be simultaneously waiting to be run, and these jobs can be waiting in various 
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stages of completion. 

How does the EXECUTIVE decide which one of a group of waiting 
jobs actually gets control of the central processor? That is the purpose 
of the job priority rating which the caller of the EXECUTIVE program 
must specify along with the address of the job. If a priority 20 job is 
running and a priority 32 job request is made to the EXECUTIVE, the 
EXECUTIVE will temporarily suspend (put to sleep) the priority 20 job 
(at a convenient point of suspension of the suspended job) and initiate the 
priority 32 job. Evidently, there must be an inviolate unshared set of 
erasable registers for each suspended (sleeping) job where its inter- 
mediate results are stored until it oan be re-awakened to finish its 
computaitions. Even the simplest job requires a certain minimum set 
of erasables - this is called a coreset (twelve erasable registers per 
coreset). More sophisticated programming jobs, which perform matrix 
and vector computations and therefore use the INTERPRETER, need an 
additional set of erasables called a VAC area (43 erasable memory cells 
per vac area). (VAC stands for vector accumulation). 

Therefore, every job needs a coreset and some jobs need a VAC 
area. Furthermore, every job must be allocated its required set of un- 
shared erasable registers from the time it begins its computations until 
the time its concludes its computations and does an ENDOFJOB - even 
if it is temporarily suspended and waiting to be awakened in order to 
finish its computations. Jobs which use coreset storage only are called 
NOVAC jobs; those which require a "vector accumulator area" also, are 
called VAC jobs. A table of the jobs which may run during landing, their 
priority, and their type (VAC or NOVAC) is given below. 

Obviously, the EXECUTIVE program cannot respond to an arbitrarily 
large number of requests to schedule jobs. For that would require an 
arbitrarily large number of erasable registers. The LUMINARY IB 
EXECUTIVE program has enough erasable memory to provide 8 coresets 
and 5 VAC areas. ^(COLOSSUS only has 7 coresets.) If the EXECUTIVE 
is requested to schedule more than eight jobs altogether or more than 
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five VAC type jobs it cannot comply. There are several actions which 
the ES^ECUTIVE could take in this overload condition. One possible 
action would be for the EXECUTIVE to terminate abruptly the lowest 
priority waiting job and pre-empt its erasables for the newly requested 
job (or ignoring the request for a new job if its priority is lower than 
the priorities of the already waiting job). Of course, in this kind of 
scheme, a NOVAC job could not be preemptively "de-scheduled" to 
accomodate a VAC job. This philosophy is not really realistic. The 
EXECUTIVE program, then might be making mission affecting decisions. 

It could conceivably ignoi'e the execution of a low-priority but nonetheless 
crucial job. The EXECUTIVE, then, should not make a decision to abort 
or kill a job. 

How, then, can an EXECUTIVE overload reasonably be handled? 

To understand how we handle EXECUTIVE overloads we have to under- 
stand how the computer handles software or hardware restarts. A 
restart causes, among other things, a cleaning out of the EXECUTIVE 
program's queue (and also the WAITLIST program's queue). The 
RESTART program then consults "phase tables" to see what jobs and 
tasks should be restarted. The phase tables are updated ^ the missio n 
programs as the computations proceed. The phase tables do not necessarily 
start a given job over at its beginning; they are more likely to restart the 
cleaned-out program at some point conveniently prior to the point at which 
the restart interruption occurred. If we think of the programming job 
as being composed of blocks or chunks of functional coding, operations 
and computations, then we see that logical points for a restart to begin (that 
is, a logical place to insert a phase change) would be at the beginning of 
the logical block or chunk which the computer is currently executing. 

Note that the advantages of causing a software restart following an 
EXECUTIVE overload are twofold. 

1) Old jobs which were not finished before their successors were 
requested are cleaned-out; i. e. , the backlog, the logjam, is 
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cleaned up. Because SERVICER does its navigation first 
in its cycle and then its guidance computations and then its 
DAP output computations, the state vector maintenance is 

I 

not likely to degrade but the DAP outputs may be old if the 
SERVICERs begin to accumulate. The philosophy of cleaning 
•out the old unfinished SERVICERs and restarting the current 
one is therefore a good idea. 

2) When a restart is executed only the programs, jobs, and 
tasks which the programmers provided for in the "phase 
tables" are restarted. (This, of course, is a two-edged 
sword for the programmer may forget to provide for a 
crucial job or task. ) Temporarily dispensable items like 
astronaut requested monitor verbs or extended verbs are 
not automatically restarted. Thus, there is likely to be an 
immediate reduction in the computer load. The astronaut 
may subsequently re-request his preemptively terminated 
monitor or extended verb. 

Reference to the Table of Jobs and Priorities below shows that 
there are only 8 jobs which can run during lunar landing and that only 
three of these are jobs which use a VAC area. Furthermore, some of 
these jobs, like HIGATJOB and LRHJOB (or LRVJOB) cannot run 
simultaneously. Therefore the 1201 alar'm (not enough VAC areas 
available) and the 1202 alarms (not enough coresets available) must have 
occurred because of multiple scheduling of the same job. Obviously, 
then, if there was a 1201 EXECUTIVE overflow some job like SERVICER 
was scheduled two and possibly three or four times. In effect, the tail 
end of SERVICER v/as not being finished before a new SERVICER was 
scheduled. 

The software restarts which accompanied the 1201 and 1202 alarm 
kept the program from becoming hopelessly snarled up. However, the 
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unusually high frequency of unanticipated "counter increment requests" 
caused the prograin to require three software restarts within about 40 
seconds in P64. 

I 

Long-Range Safer and More Flexible Software 

It ha's been proposed by several people at MIT/IL (Peter Adler and 
Don Eyles) to provide a variable DT SERVICER program which starts a 
new cycle of computations as soon as the old cycle is finished. Our 
present SERVICER has a fixed DT of two seconds and the AVERAGE G, 
FINDCDUW and PIPA and GYRO compensation routines take advantage 
of this value of fixed integral number of seconds chosen for DT. Therefore, 
this programming change is not trivial. Such a mechanization has 
tremendous advantages however. By running all non-SERVICER jobs 
(like extended verbs and monitor verbs) at a higher priority than 
SERVICER, we can get every job done that is requested in one SERVICER 
cycle by making the end of one SERVICER cycle schedule the next SERVICER 
cycle. The computer is continually busy this way but the programs never 
overburden the EXECUTIVE job scheduling capacity. In effect, you have 
the shortest DT possible for the number of tasks and jobs you are trying to 
do. This also allows new computations (PCRs) to be inserted with the 
simple consequence of increasing the average DT. The undesirable effects 
are that the DT varies and verification of stability due to sampling frequency 
Of PIPA reading and output command frequency to the DAP is more 
difficult. 

How the RR Can Rob 15% of the L GC 

The 1201 and 1202 program alarms which occurred during the mission 
G landing were due to a computer overload caused by high frequency shaft and 
trunnion angle "counter" incrementing from the RR ECDU's. The spurious 
erratic high frequency output from the RR ECDUs can occur whenever the 
rendezvous radar control switch is in SLEW or AUTO TRACK (rather than 
LGC) because then the ATCA provides a 15 volt 800 cps signal to the angle 
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resolver primaries which has a random phase relationship to the 800 cps 
which the PSA provides for a reference to the RPi. ECDUs. The consequence 
is. that the ECDUs may count at their maximum rate, 6400 cps, in their 
futile attempts to null the ECDCJ ''ei''ror's". The loss of computation time 
for the two ECDUs counting at maximum rate is 12, 8 X 10^ X 11. 72 X lO"® 

= 0. 15 seconds/ second or 15% of the LGC time since each involuntary 
"counter" increment takes a memory cycle (one memory cycle = 11, 72 
microseconds). 

The Apollo 11 crew checklist specified that the R.R switch be in 
AUTO TRACK immediately before calling P63. We were unaware of the 
effect this w'ould have on the compujer execution time load. 

Preventing Future LGC Memory Cycle Robbery by the RR ECDUs 

When the RR mode switch is in either the AUTO TRACK or SLEW 
positions the RR ECDUs obviously cannot digitize the signal transformed 
from the shaft and trunnion angle resolver primaries to the input to the 
ECDUs. The only time the LGC is interested in the ECDU signal is when 
the RR mode switch is in LGC. Therefore, a PCR has been written, 
approved, and implemented in LUMIMARY IB to monitor the discrete 
(the RR Power On/ LGC Mode discrete) which the LGC receives when the 
RR switch is in the LGC position. When the LGC does not receive this 
discrete, it sets the RR. ECDU zero command which stops the ECDUs' 
attempt to digitize the analog voltage from the resolvers. This action 
effectively prevents "counter increment requests" from the ECDUs and 
prevents central processor execution of these meaningless interrupts. 

It should be mentioned that if the RR DC circuit breaker is pulled 
the LGC does not receive the RR Power On/ LGC Mode discrete, and 
therefore the LGC ^yill'zero the RR ECDUs. This action is correct since 
the LGC is not interested in the shaft and trunnion angle of the RR when 
its circuit breakers are pulled. 
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We may modify our all-digital simulator to send 6400 pps from 
each RR ECDU to the LGC if the RR switch is not in LGC and the LGC 
has not set the RR CDU ZERO bit. Of course, if the simulator had this 
feature for our software verification tests we wt»uld have detected the 
computer overload during our testing. 

In addition to the above PGR, other actions have been taken. 

PGR 81-4 for example, provides for a modified V57 which incorporates a 
display of N68 and avoids the need for a monitor verb and job (MONDO). 

The sepai’ate call of MAKEPLAY has been eliminated by incorpor- 
ating this job into SERVICER. 

A Program to Keep Similar Events Fr om Occurring in the Future. 

There were folks who knew about the RR resolver/ECDU/LGC 
interface mechanization. There were also a great many people who knew 
the effect v/hich a 15% TLOSS would have on the landing program’s operation. 
These folks never got together on the subject. Therefore we need to improve 
communications. 

There are various simulations throughout the country which might 
have detected this problem if they had simulated the RR resolver/ECDU/LGC 
interface with enough fidelity. Therefore, we might look at our various 
simulator configurations to see whether we are leaving anything significant 
out. We have just started a new Simulator Configuration Control Board at 
MIT/IL which will function like your own Apollo Software Configuration 
Control Board. This board v/ill adjudicate the approval of SCRs (simulator 
change requests). We will encourage the writing of SCRs which can 
increase the fidelity of the simulator in significant ways. I hope to con- 
stitute this board with hardware personnel as well as software personnel 
so that important hardware/softw'are interface features are not overlooked. 

Our anomaly prevention plan includes the following steps: 

1. "What if” sessions between hardware and software personnel. 
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2. Orientation lectures by hardware personnel for the benefit 
of software personnel so that programmers and mission 
support personnel better understand the hardware. 

3. A review of the adequacy of tlie sitnulator we use for mission 
verification. Perhaps we can get hardware personnel from 

. GAEC, MSC and MIT/IL to review our current models. 

4. Scrutiny of the crew checklist by the hardware personnel. 

(This could have prevented the alarms. ) 

5. An insistence that anyone who knows anything peculiar about 
the hardware, the hardware/software interface, or the software 
document his inforination in the appropriate place (AOH, GSOP, 
Program Note, etc. ) and tell the appropriate personnel. 


Very truly yours 



George W. Cherry 
Luminary Project Director 
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TABLE OF JOBS AND PRIORITIES 


Name of Job 
and Type 

Description of Job 

Prior;ity of Job 

SERVICER 

(VAC) 

Performs Navigation, Mass Updating, 
Guidance Computations, and Throttle 
and DAP Command Functions 

i 

20 but does a priority change ! 

(PRIOCHNG) to 24 while using | 

subroutine to keep HIGATJOB /\J 
from interf erring (potential 
erasable conflict) ”i 

1 

MAKEPLAY 1 

(VAC) . 1 

Display job set up by SERVICER 

20 j 

1 

1/GYRO 

(NOVAC) 1 

1 

1 IMU compensation job. Compensates 

gyros . 

21 1 

i 

LRHJOB i 

(NOVAC) 1 

1 ■ 

1 ■ Reads the LR range 

i ■ ■ 

1 

32 but they only run for a milli- 
second or so and then sleep for 
about 80 millisecond-s while the 
. LR syne pulses are sent out. They 
are awakened when LR reading is 
completed and then run for another 
millisecond or so. 

LRVJOB 

(NOVAC) 

i 

1 Reads the LR velocities 

i 

1 

i 

CHARIN 

(NOVAC) 

i 

1 

I Started at each DSKY keystroke entry 

1 for interpretation of keystroke. 

30 

1 
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TABLE OF JOBS AND PROBITIES (CONT. ) 


Name of Job 
and Type 

Description of Job 

! 

, 1 

Priority of Job | 

r T-- . ; 

’ 

HIGATJOB 

Commands the LR antenna to re- 

Runs at 32 for a millisecond i 

(VAC) 

position to position #2 and sets up 

or two. Sleeps until position 

1 

! 

j 

a waitlist task to check for appear- 
ance of the position #2 discrete. 

#2 discrete is received. Is 
awakene^ at priority 32 but j 

j 

I 

i 

{ 

{ 

1 

i 

j 

» 

i 

i 

1 

1 

i 

Sleeps during monitor. Is awakened 
after monitor finds position #2 discrete 
and computes, position #2 beam vectors 
for LR Read Routine. 

■ changes\o priority 23 before 1 

calling si^ro^tine TRG'''SMNB j 

Note that r^ERVICER is } 

using QUIOTRIG, HIGATJOB i 

cannot ii^r^re because \ 

.SERVICER isN^^a higher { 

priorit-/ then. Notice that if 1 

HIGATJOB is using TRG-SMNB ! 

SERVICER 'cannot make its | 

priority change to 24 and hence j 

cannot interfere.' j 

. J 

{ 

i 

! ' HONDO 

Monitor Verb Job. Runs during 

CO 

o 

i (NO VAC) 

! . 1 

V16N68. 

I 


Note: Both TRG=':<SMNB and QUICTRIG wse tl^CDUSPOT registers. Hence a potential 
erasable conflict exists if TRG-SMl^^is called while QUICTRIG is executing or 
vice versa. / x 
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IMPORTANT EVENTS OCCURRING 
DURING LUNAR LANDING 


TFI . 

H 

HDOT 

CDUY 

ah 

V 


Time From 
Ignition 
. (seconds) 

(feet) 

(ft/ sec) 

(degrees) 

(feet) 

(ft/ sec) 

DSKY or 

Special Activity 

2 

49971 

- 2.2 

.. lOJ. 25 


5559.7 


232 

42426. 

- 78.7 

80 


3366.4 

Begin Yaw Windows Up 

262-286 






LR Data Good 

286 

37672 

-104. 3 

79 


2745.4 


304 

35706 ■ 

-113. 6 

77 

-2884 

2521. 7 

V.16N68 

316 



77 

-2570 


Program Alarm 

322 



76 

-2354 


V P5N09E (R1 = 1202) 

338 

31566 

-128.8 

75 

-2655 

2086. 1 

V57E 

344 

30772 


73 

-1785 


X Axis Over-ride Denied 

346 



73 

. -1772 


■ V16N68E 

358 



71 

-1826 


Program Alarm (1202) 

372 

24703 

.-113.3 

66 

- 26 

1640.2 


374 . 



66 . 

- 78 


V16N68E 

380 



68 

- 98 


Key Release 

384 

23393 

.- 96.7 

67 

- 86 

1481. 1 

Throttle Down 

396 

22233 

-102. 1 

65 

- 75 

1367. 0 

ENTER 

450 

15273 

-136. 9 

57 

- 48 

961. 0 

ENTER , 

506 

7380 

-125.7 . 

55 


526. 2 

P64 

512 

6636 

-121.4 

47 


LPD 66 

LPD in degrees, LRin Pos #2 

528 

4772 

- 98.2 

43 

- 30 

59 
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